Using Wikipedia for term extraction in the biomedical domain: first experiences
Abstract
We present a term extractor that uses Wikipedia as a semantic information source. The system has been tested on a Spanish medical corpus. We compare the results obtained using a module of a hybrid term extractor with those of an equivalent module that uses Wikipedia. The results show that this resource may be used for this task.
Similar resources
Extracting terminology from Wikipedia
In this paper we present a new approach for obtaining the terminology of a given domain using the category and page structures of Wikipedia in a domain- and language-independent way. The idea is to take advantage of the category graph of Wikipedia, starting with a set of categories that we associate with the domain. After obtaining the full set of categories belonging to the selected domain, the col...
Term Validation for Vocabulary Construction and Key Term Extraction
We extract new terminology from a text by term validation in a dictionary. Our approach is based on estimating probabilities for previously unseen terms, i.e. not present in a dictionary. To do this we apply several probabilistic models previously not used for term recognition and propose a new one. We apply restriction of domain similarity on terms used for probability estimation and vary the ...
Mining and Ranking Biomedical Synonym Candidates from Wikipedia
Biomedical synonyms are important resources for Natural Language Processing in the biomedical domain. Existing synonym resources (e.g., the UMLS) are not complete. Manual efforts for expanding and enriching these resources are prohibitively expensive. We therefore develop and evaluate approaches for automated synonym extraction from Wikipedia. Using the inter-wiki links, we extracted the candidate ...
INRIASAC: Simple Hypernym Extraction Methods
For information retrieval, it is useful to classify documents using a hierarchy of terms from a domain. One problem is that, for many domains, hierarchies of terms are not available. Task 17 of SemEval 2015 addresses the problem of structuring a set of terms from a given domain into a taxonomy without manual intervention. Here we present some simple taxonomy structuring techniques, such as ...
A Pipeline for Supervised Formal Definition Generation
Ontologies play a major role in life sciences, enabling a number of applications. Obtaining formalized knowledge from unstructured data is especially relevant for the biomedical domain, since the amount of textual biomedical data has been growing exponentially. The aim of this paper is to develop a method of creating formal definitions for biomedical concepts using textual information from scientif...
Journal:
- Procesamiento del Lenguaje Natural
Volume 45
Publication date: 2010